Abstract

Time-varying fading channels present a major challenge in the design of wireless communication 
systems. Adaptive schemes are often employed to adapt the transmission parameters to receiver-based 
estimates of the quality of the channel. We consider a pilot-based adaptive modulation scheme without 
the use of a feedback link. In this scheme, pilot tones (known by sender and receiver) are periodically sent 
through the channel for the purpose of channel estimation and coherent demodulation of data symbols 
at the receiver. We optimize the duration and power allocation of these pilot symbols to maximize 
the information-theoretic achievable rates using binary signaling. We analyze four transmission policies 
and numerically show how optimal training in terms of duration and power allocation varies with the 
channel conditions and from one transmission policy to another. We prove that for a causal estimation 
scheme with flexible power allocation, placing all the available power on one pilot is optimal. 

Index Terms 

Adaptive modulation, pilot symbol assisted modulation, fading channels, Rayleigh fading, power 
allocation, training duration. 


I. Introduction 

In digital mobile communications, fast fading degrades the Bit Error Rate (BER) of the channel
and inhibits coherent detection [1]. It is known that performance is limited by channel estimation
errors [1]-[4]. Pilot Symbol Assisted Modulation (PSAM) is a technique that was introduced
in [5] to mitigate these effects. In this scheme, known training symbols (pilots) are periodically
inserted into the data frame for the purpose of channel estimation and coherent demodulation
of the data symbols.

Furthermore, channel-adaptive modulation dynamically adjusts certain transmission parameters
such as the constellation size, transmitted power, and code rate according to the channel quality. 
Adaptive signaling provides in general higher bit rates (relative to conventional nonadaptive 
methods) by increasing the transmission throughput under favorable channel conditions and 
reducing it as the channel condition is degraded. 

Some of the previous adaptive schemes rely on a channel-feedback link to provide the transmitter
with the Channel Side Information (CSI) [6], [7]. In [8], the authors consider adaptive modulation
with one pilot in addition to delayed feedback to the transmitter and prove that power adaptation
via periodic feedback can increase the achievable rates. Similarly, in [9], the authors consider
pilot-based adaptive modulation where the channel estimate is fed back to the transmitter in order
to adapt the data and pilot power, and they study the optimal power allocation policy for data
and pilot symbols. The authors in [10] discuss adaptive modulation with feedback and develop
an adaptive scheme that accounts for both channel estimation and prediction errors in order
to meet a target BER. In [11], the authors attempt to optimize the spectral
efficiency subject to a specific BER constraint in a pilot-based adaptive modulation setup with
feedback. The above-mentioned works study the performance of such systems and show that adaptive
modulation using pilots can increase the achievable rates in general. However, systems that rely
on a channel-feedback link have two disadvantages: they are complex to model, and feedback becomes
infeasible when the channel fades faster than it can be estimated (or predicted) and fed back
to the sender. Optimizing the pilot placement, power allocation and modulation schemes in a
pilot-based setup is an active area of research, whether in the case of a single receiver
[12]-[16] or multiple receivers [17]-[19].

A modified pilot-based adaptive modulation scheme over Rayleigh fading channels was presented
in [20]. This scheme adapts the coded modulation strategy at the sender to the quality of
the channel estimation (estimation error variance) at the receiver without requiring any channel
feedback. In this work we study the performance of this non-feedback adaptive modulation
scheme over time-varying Rayleigh fading channels. Unlike the scheme in [20], we consecutively
send a cluster of $k$ pilots ($k \ge 1$) per data frame, with $k$ being an optimization variable [21].
We determine the optimal duration and power allocation of the training period under different
transmission policies for both causal and non-causal estimation. We study such systems at low
received Signal-to-Noise-Ratio (SNR) levels, and the performance is
measured in terms of achievable rates using binary signaling. We prove that, in the case of causal
estimation, the "optimal" power allocation scheme (the one that minimizes the error variance of
the estimates of the channel parameters, and that is set up offline without requiring feedback)
is the one in which all the available power is allocated to one pilot, if the constraints allow it.

The organization of this paper is as follows. In Section II we present the fading channel model,
the adaptive transmission technique we use to transmit over the channel as well as the receiver
details. The measure of performance is discussed in Section III, the optimal power allocation for
causal estimation is proved in Section IV, and the numerical results are presented in Section V.
In Section VI we present possible extensions to other fading models and Section VII concludes
the paper.


II. Preliminaries 

A. The Channel Model 

Consider the single-user discrete-time model for the Rayleigh fading channel,

$$Y_i = R_i X_i + N_i,$$

where $i$ is the time index, $X_i \in \mathbb{C}$ is the channel input at time $i$, $Y_i \in \mathbb{C}$ is its output, and $R_i$ and
$N_i$ are independent complex circular Gaussian¹ random variables with zero mean and variance
$\sigma_R^2$ and $\sigma_N^2$ respectively. The amplitude of the fading coefficient $R_i$ is then Rayleigh distributed
and its phase is uniform over $[-\pi, \pi)$. To account for power constraints, the input is subject to

$$E\left[|X_i|^2\right] \le P_i,$$

for some parameters $\{P_i\}$ that could, for example, all be equal to a constant. Since from
an information theoretic perspective scaling the output by $1/\sigma_R$ does not change the mutual
information, we assume without loss of generality that $\sigma_R = 1$. The variance of the noise is
to be generally interpreted as $(\sigma_N^2/\sigma_R^2)$.

¹ A complex Gaussian random variable is circular if and only if it is zero-mean and its real and
imaginary parts are independent with equal variances.

We assume in this study that the fading process follows a stationary first-order Gauss-Markov
model introduced in [22], i.e.,

$$R_i = \alpha R_{i-1} + Z_i, \tag{1}$$

where the samples $\{Z_i\}$ are Independent and Identically-Distributed (IID) complex circular
Gaussians with mean zero and variance equal to $\sigma_Z^2 = (1 - \alpha^2)$, with $\alpha \in [0, 1)$ to guarantee
stationarity.
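As a minimal illustration of model (1), the following Python sketch draws samples of the fading process and evaluates its autocorrelation, which for a unit-variance first-order Gauss-Markov process is $\alpha^{|l|}$ at lag $l$. The function names and the seeded generator are assumptions made for this example only, and the sketch relies on the normalization $\sigma_R = 1$ adopted above.

```python
import random


def gauss_markov_fading(alpha, n, seed=0):
    """Draw n samples of R_i = alpha * R_{i-1} + Z_i with unit variance."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1) for stationarity")
    if n < 1:
        raise ValueError("n must be at least 1")
    rng = random.Random(seed)

    def circular_gaussian(variance):
        # Real and imaginary parts are independent with variance / 2 each.
        std = (variance / 2.0) ** 0.5
        return complex(rng.gauss(0.0, std), rng.gauss(0.0, std))

    r = circular_gaussian(1.0)  # start in the stationary distribution
    samples = [r]
    for _ in range(n - 1):
        r = alpha * r + circular_gaussian(1.0 - alpha ** 2)
        samples.append(r)
    return samples


def autocorrelation(alpha, lag):
    """Autocorrelation E[R_i R*_{i+lag}] of the unit-variance process."""
    return alpha ** abs(lag)


def stationary_variance_step(alpha, var_prev):
    """Variance of R_i given the variance of R_{i-1}; 1 is the fixed point."""
    return alpha ** 2 * var_prev + (1.0 - alpha ** 2)
```

The innovation variance $(1 - \alpha^2)$ is exactly what keeps the variance of $R_i$ at one from one time step to the next, which `stationary_variance_step` makes explicit.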

Even though we analyze the benefits of pilot clustering by assuming that the autocorrelation
function of the fading process is derived from a stationary first-order Gauss-Markov model (1),
we argue in Section VI that the methodology may be readily adapted to other models and present
the case of Jakes' model [23], which takes into account higher orders of correlation.

B. The Adaptive Transmission Scheme 

At regular intervals, the transmitter successively sends $k$ known pilot symbols whose purpose
is to enable the estimation of the channel at the receiver. The channel estimation is solely based
on the pilot symbols and no data-directed estimation is used. For each time sample $i$, the receiver
computes the Minimum Mean-Square Error (MMSE) estimate of the channel, the quality of which
(measured through the estimation error variance) depends on its position with respect to the
pilot symbols. After estimation, the channel, as seen by the receiver, is a Rician channel whose
specular part is given by the estimate and whose Rayleigh component is given by the zero-mean
Gaussian-distributed estimation error.

Although the scheme is adaptive, it does not use feedback to determine its policy. The key idea
is that the transmitter adapts to the quality of channel estimation (specifically to the mean-square
error, which is independent of the value of the estimate available only at the receiver) rather than
to the estimate of the channel. Since the estimation error variance is computed offline, the adaptive
transmission scheme can be determined offline as well and adopted by the transmitter. Even
though there is no feedback to the transmitter, it is aware of the statistics of the estimation error
beforehand.

The transmitter employs multiple codebooks in an interleaved fashion as shown in Figure 1.
It adapts its throughput to the estimation error variance by coding the data symbols according
to their distance from the training pilots. Symbols that are far away from the pilots encounter
poorer channel estimates at the receiver and are therefore coded with lower rate codes, while
closer symbols benefit from small estimation error variance and are coded with higher rate codes.


[Figure: symbols drawn from Codebook 1 (Rate R_1), Codebook 2 (Rate R_2), ..., Codebook m (Rate R_m) are interleaved across the transmitted frame.]

Fig. 1. Multiple Codebook Interleaving
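The interleaving can be sketched as a mapping from time slots to codebooks. The sketch below assumes, purely for illustration, one codebook per data position in the frame (so that $m = T - k$); the text does not fix $m$, and positions with equal estimation error variance could equally share a codebook. The function name `codebook_schedule` is hypothetical.

```python
def codebook_schedule(T, k, num_frames):
    """Return, for each time slot, None for a pilot or the codebook index.

    Assumption for illustration: the data symbol at frame position pos
    (k <= pos < T) is always drawn from codebook number pos - k, so every
    codebook sees symbols with the same estimation error variance.
    """
    if not 1 <= k < T:
        raise ValueError("need 1 <= k < T")
    schedule = []
    for _ in range(num_frames):
        for pos in range(T):
            schedule.append(None if pos < k else pos - k)
    return schedule
```

For example, `codebook_schedule(5, 2, 2)` yields two pilots followed by one symbol from each of three codebooks, repeated for both frames.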


We only consider binary signaling. The motivation for this choice is twofold. First,
in [24] the authors prove that for discrete-time memoryless Rayleigh fading channels subject to
average power constraints, the capacity achieving distribution is discrete with a finite number
of mass points. Moreover, a binary distribution was found to be optimal at low and moderate
values of SNR [24]-[26]. Second, for a memoryless Rician fading channel, Luo [21] established
a similar result that, combined with Gallager's in [25], implies that the binary input distribution
is asymptotically optimal at low SNR [27]. Consequently, we choose the alphabet of every
codebook to consist in general of two symbols:

$$\begin{cases} m_1 = a_1 + j b_1 & \text{with probability } p_1 \\ m_2 = a_2 + j b_2 & \text{with probability } p_2 = (1 - p_1). \end{cases}$$

The rate of the codebooks is adjusted by modifying the probability distribution of the mass
points. Numerical results in [12], [20] indicate that the optimal mass points always lie between
the extremes of on-off keying (optimal for the IID Rayleigh fading case where no CSI is available
at the receiver) and antipodal signaling (optimal for a perfectly known channel). It is worth
noting that some of the work in the literature considers these two extremes for designing the
constellation mapping and tries to optimize the transmission model in the case of imperfect CSI
based on the SNR level [12]. Moreover, any rotational transformation of the two mass points
will not affect the mutual information [24], [21]. Therefore an optimal input distribution consists
of two mass points $m_1, m_2 \in \mathbb{R}$ with $-\sqrt{P} \le m_1 \le 0$ and $m_2 \ge \sqrt{P}$.
C. Channel Estimation at the Receiver

Given a pilot spacing interval $T$, we send $k$ pilots in the beginning of every data frame as
shown in Figure 2. When transmitting a pilot at time index $i$, the input of the channel is $\sqrt{P_i}$
and its output is

$$Y_i = \sqrt{P_i} R_i + N_i, \quad i = 0, \cdots, k - 1.$$




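[Figure: a frame of length T (the pilot spacing interval) made of k pilots at indices 0, ..., k - 1, the last one with amplitude √P_{k-1}, followed by T - k data symbols; each data index j is labeled with the pair R_j, V_j.]

Fig. 2. Pilot Symbols and Channel Estimation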


On the receiver side, we perform MMSE estimation based on the received signal during
training. More precisely, we denote by $S$ the set of indices corresponding to the received pilots
involved in estimating $R_j$ for $j = k, \ldots, T - 1$. Therefore, when $S = \{0, \ldots, k - 1\}$ we
say we are performing causal MMSE estimation, and when $S = \{0, \ldots, k - 1, T, \ldots, T + k - 1\}$
the MMSE estimate is said to be non-causal.

Next, we compute the MMSE estimate $\hat{R}_j$ of $R_j$ for $j = k, \ldots, T - 1$. Since the
random variables $\{R_j, \{Y_s\}_{s \in S}\}$ are jointly Gaussian, the MMSE estimator is linear and is
identical to the Linear Least-Squares Estimator (LLSE), the error variance $V_j$ of which is

$$V_j = 1 - \Lambda_{R_j, \{Y_s\}_{s \in S}} \, \Lambda_{\{Y_s\}_{s \in S}}^{-1} \, \Lambda_{R_j, \{Y_s\}_{s \in S}}^{T}, \tag{2}$$

where $\Lambda_{R_j, \{Y_s\}_{s \in S}}$ is the cross-covariance matrix between $R_j$ and $\{Y_s\}_{s \in S}$, and $\Lambda_{\{Y_s\}_{s \in S}}$ is the
autocovariance of the vector of received pilots $\{Y_s\}_{s \in S}$.

We note that the estimation error variance in equation (2) may be computed offline at design
time (and therefore no feedback to the encoder is needed) and is only dependent on the
autocorrelation function of $\{R_j\}$, the transmitted pilots and the noise spectral density.
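To make the offline computation concrete, the following sketch evaluates the error variance (2) for the first-order Gauss-Markov model, where the correlation between $R_a$ and $R_b$ is $\alpha^{|a-b|}$. It is an illustrative implementation under the normalization $\sigma_R = 1$; the function names and the small Gaussian-elimination helper are assumptions of this example. The pilots are real and the correlation is real, so real arithmetic suffices.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def error_variance(j, pilot_idx, pilot_pow, alpha, noise_var=1.0):
    """Error variance V_j of equation (2) for pilots at the given indices.

    pilot_idx lists the set S; pilot_pow lists the matching powers P_s.
    """
    n = len(pilot_idx)
    amp = [p ** 0.5 for p in pilot_pow]
    # Cross-covariance between R_j and Y_s = sqrt(P_s) R_s + N_s.
    cross = [amp[a] * alpha ** abs(j - pilot_idx[a]) for a in range(n)]
    # Autocovariance of the received pilots.
    auto = [[amp[a] * amp[b] * alpha ** abs(pilot_idx[a] - pilot_idx[b])
             + (noise_var if a == b else 0.0)
             for b in range(n)] for a in range(n)]
    weights = solve(auto, cross)
    return 1.0 - sum(c * w for c, w in zip(cross, weights))
```

With a single pilot of power $P$ at index 0 the result reduces to $V_j = 1 - \alpha^{2j} P / (P + \sigma_N^2)$. Passing `pilot_idx=[0, T]` gives the non-causal variant, which can only lower the variance since it adds an observation.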
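III. Achievable Rates

We consider the transmission scheme shown in Figure 2 with symbols sent with power $P_j$ for
$j = 0, \ldots, T - 1$. Given a sample $\{y_s\}_{s \in S}$, the received symbols can be written

$$Y_i = R_i X_i + N_i = (\hat{R}_i + \tilde{R}_i) X_i + N_i, \quad \text{for } i = k, \ldots, T - 1,$$

where $\tilde{R}_i$ is a zero-mean complex Gaussian error term that has a variance $V_i$. Therefore,

$$p\left(y_i \mid x_i, \{y_s\}_{s \in S}\right) = \mathcal{N}_{\mathbb{C}}\left(\hat{r}_i x_i, \; V_i |x_i|^2 + \sigma_N^2\right) = \frac{1}{\pi \left(V_i |x_i|^2 + \sigma_N^2\right)} \exp\left(-\frac{|y_i - \hat{r}_i x_i|^2}{V_i |x_i|^2 + \sigma_N^2}\right).$$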

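When ignoring the fading correlation from one transmitted frame to another, the mutual
information per symbol due to interleaving can be written as

$$\frac{1}{T} I\left(\{X_i\}_{i=k}^{T-1}; \{Y_i\}_{i=k}^{T-1} \mid \{Y_s\}_{s \in S}\right) = \frac{1}{T} \sum_{i=k}^{T-1} E_{\{Y_s\}_{s \in S}}\left[I\left(X_i; Y_i \mid \{Y_s\}_{s \in S}\right)\right] = \frac{1}{T} \sum_{i=k}^{T-1} E_{\hat{R}_i}\left[I\left(X_i; Y_i \mid \hat{R}_i\right)\right], \tag{3}$$

where the expectation is now over the random variable $\hat{R}_i$. Note that $\hat{R}_i$ is a linear combination
of the observations $\{Y_s\}_{s \in S}$ and

$$\hat{R}_i = R_i - \tilde{R}_i \sim \mathcal{N}_{\mathbb{C}}(0, 1 - V_i). \tag{4}$$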




A. The Computation Method

The term $E_{\hat{R}_i}\left[I\left(X_i; Y_i \mid \hat{R}_i\right)\right]$ in equation (3) depends on the choice of the corresponding
binary probability distribution, fully characterized by the three parameters $\{m_1, m_2, p\}_i$. This
distribution (for $i = k, \ldots, T - 1$) determines the rate of the corresponding codebook and should
be chosen to maximize the mutual information quantity in (3). Therefore, we are interested in
solving

$$\max \frac{1}{T} \sum_{i=k}^{T-1} E_{\hat{R}_i}\left[I\left(X_i; Y_i \mid \hat{R}_i\right)\right] = \frac{1}{T} \sum_{i=k}^{T-1} \max_{\{m_1, m_2, p\}_i} E_{\hat{R}_i}\left[I\left(X_i; Y_i \mid \hat{R}_i\right)\right] \tag{5}$$

subject to $E\left[|X_i|^2\right] \le P_i$ for all $i = k, \ldots, T - 1$.

Furthermore, examining the probability law (4) of $\hat{R}_i$ indicates that the elementary quantity
$\max_{\{m_1, m_2, p\}_i} E_{\hat{R}_i}\left[I\left(X_i; Y_i \mid \hat{R}_i\right)\right]$ in (5) is only a function of the estimation error variance $V_i$ and
power $P_i$ of the symbol. We define

$$I_{sub}(P_i, V_i) = \max_{\{m_1, m_2, p\}_i} E_{\hat{R}_i}\left[I\left(X_i; Y_i \mid \hat{R}_i\right)\right],$$

where the maximization is subject to $E\left[|X_i|^2\right] \le P_i$ and $\hat{R}_i \sim \mathcal{N}_{\mathbb{C}}(0, 1 - V_i)$. Thereafter the
achievable rates become

$$\frac{1}{T} I\left(\{X_i\}_{i=k}^{T-1}; \{Y_i\}_{i=k}^{T-1} \mid \{Y_s\}_{s \in S}\right) = \frac{1}{T} \sum_{i=k}^{T-1} I_{sub}(P_i, V_i). \tag{6}$$

The two-dimensional curve $I_{sub}(P, v)$ is computed over a fine grid $\mathcal{P} = \{0 \le P \le P_{max}, 0 \le v \le 1\}$
as shown in Figure 3. Then, given a transmission strategy consisting of an inter-pilot
spacing $T$, $k$-pilot clustering, and a power allocation $P_j$ for $j = 0, \ldots, T - 1$, we calculate
using equation (2) the estimation error variance $V_j$ for $j = k, \ldots, T - 1$. The corresponding
elementary mutual information quantity $I_{sub}(P_j, V_j)$ can now be interpolated from the data set
$\{P, v, I_{sub}(P, v)\}$ and used to compute the normalized sum in (6).



Fig. 3. The two-dimensional curve $I_{sub}(P, v)$
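The lookup step of this modular method can be sketched as follows. The sketch assumes bilinear interpolation on a rectangular grid, which is one reasonable choice and not necessarily the interpolation rule used for the reported results; the function names and the table layout `table[a][b]` are assumptions of this example, and the table itself must come from a separate computation of the elementary mutual information.

```python
from bisect import bisect_right


def bilinear(p_grid, v_grid, table, p, v):
    """Interpolate table[a][b] = I_sub(p_grid[a], v_grid[b]) at (p, v).

    Both grids must be sorted and hold at least two points. Points outside
    the grid are linearly extrapolated from the nearest cell.
    """
    def locate(grid, x):
        # Index of the left neighbour and fractional offset inside the cell.
        i = min(max(bisect_right(grid, x) - 1, 0), len(grid) - 2)
        return i, (x - grid[i]) / (grid[i + 1] - grid[i])

    a, s = locate(p_grid, p)
    b, t = locate(v_grid, v)
    return ((1 - s) * (1 - t) * table[a][b]
            + s * (1 - t) * table[a + 1][b]
            + (1 - s) * t * table[a][b + 1]
            + s * t * table[a + 1][b + 1])


def achievable_rate(p_grid, v_grid, table, data_powers, variances, T):
    """Normalized sum over the T - k data symbols of a frame of length T."""
    total = sum(bilinear(p_grid, v_grid, table, p, v)
                for p, v in zip(data_powers, variances))
    return total / T
```

Separating the expensive table computation from the cheap per-strategy lookup is what allows many values of $k$, $T$ and the power allocation to be compared quickly.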


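Finally, note that the error variance is a function of the power of the pilots $\{P_s\}_{s \in S}$. Hence
equation (6) can also be written as

$$\frac{1}{T} I\left(\{X_i\}_{i=k}^{T-1}; \{Y_i\}_{i=k}^{T-1} \mid \{Y_s\}_{s \in S}\right) = \frac{1}{T} \sum_{i=k}^{T-1} I_{sub}\left(P_i, V_i\left(\{P_s\}_{s \in S}\right)\right). \tag{7}$$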

B. The Transmission Policy

We consider four types of transmission policies and we study how the optimal training strategy
differs from one policy to another, analytically in Section IV and numerically in Section V.

1) Policy I: The pilot symbols and the data symbols are transmitted with the same amount
of power, i.e.,

$$P_s = P, \ \forall s = 0, \ldots, k - 1 \quad \& \quad P_i = P, \ \forall i = k, \ldots, T - 1.$$

Therefore, for a given channel model, $k$-pilot training, and an inter-pilot spacing $T$, the
achievable rate in equation (7) is a function of $P$ only.

2) Policy II: In this policy, a flat power allocation is adopted for both the pilot symbols and
the data symbols, but we allow the two levels to be different. More precisely,

$$P_s = P_{tr}, \ \forall s = 0, \ldots, k - 1 \quad \& \quad P_i = P_d, \ \forall i = k, \ldots, T - 1.$$

The achievable rate is a function of $P_{tr}$ and $P_d$, which satisfy

$$\frac{1}{T} \sum_{j=0}^{T-1} P_j \le P.$$

3) Policy III: Following a flat power allocation for pilots ($P_s = P_{tr}, \ \forall s = 0, \ldots, k - 1$), the
data symbols are sent with power $P_i$ for $i = k, \ldots, T - 1$. These power levels satisfy

$$\frac{1}{T} \sum_{j=0}^{T-1} P_j = \frac{1}{T} \left[k P_{tr} + \sum_{j=k}^{T-1} P_j\right] \le P.$$

4) Policy IV: We send both the pilots and data symbols with variable power $P_j$ for $j =
0, \ldots, T - 1$. The constraint on the power levels is now given by

$$\frac{1}{T} \sum_{j=0}^{T-1} P_j \le P.$$
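The four policies differ only in how the per-symbol powers of a frame are constrained, which the following sketch makes explicit. The helper names are hypothetical, and the choice of meeting the average power constraint with equality under policy II is an assumption of this example (the constraint itself is an inequality).

```python
def frame_policy_1(T, k, P):
    """Policy I: every pilot and data symbol uses the same power P."""
    return [P] * T


def frame_policy_2(T, k, P, P_tr):
    """Policy II: flat pilots at P_tr, flat data at the largest feasible P_d.

    Assumption: the average power constraint is met with equality.
    """
    P_d = (T * P - k * P_tr) / (T - k)
    if P_d < 0:
        raise ValueError("pilot power exceeds the frame power budget")
    return [P_tr] * k + [P_d] * (T - k)


def average_power(frame):
    """Left-hand side of the average power constraint."""
    return sum(frame) / len(frame)


def satisfies_constraint(frame, P, tol=1e-12):
    """Check non-negativity and the average power constraint."""
    return all(p >= 0 for p in frame) and average_power(frame) <= P + tol


def has_flat_pilots(frame, k):
    """Policies I to III require equal pilot powers; policy IV does not."""
    return all(p == frame[0] for p in frame[:k])


def concentrate_pilots(frame, k):
    """Policy IV style frame: move the whole training budget to pilot k - 1."""
    return [0.0] * (k - 1) + [sum(frame[:k])] + list(frame[k:])
```

A policy III or IV frame is then any power vector that passes `satisfies_constraint`, with `has_flat_pilots` additionally required for policy III.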

IV. Optimal Power Allocation for Causal Estimation 

In this section we find the optimal power allocation and training duration for policies II, III
and IV under causal estimation. These optimal solutions are found by applying the result of
Theorem 1 stated hereafter. The theorem implies that if we let $k P_{tr}$ be the total power "budget"
for the training period, everything else being equal, among all the training power allocation
schemes $\{P_s\}_{s=0}^{k-1}$ that satisfy

$$\sum_{s=0}^{k-1} P_s = k P_{tr}, \tag{8}$$

the optimal one is the one where all the power is allocated to the last time slot $(k - 1)$.
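For causal MMSE estimation, $S = \{0, \ldots, k - 1\}$ and equation (2) can be written as

$$V_j = 1 - \Lambda_{R_j, \{Y_s\}_{s \in S}} \, \Lambda_{\{Y_s\}_{s \in S}}^{-1} \, \Lambda_{R_j, \{Y_s\}_{s \in S}}^{T} = 1 - \alpha^{2(j-k+1)} \begin{pmatrix} \alpha^{k-1} & \cdots & \alpha & 1 \end{pmatrix} D^T \left[D A D^T + \sigma_N^2 I\right]^{-1} D \begin{pmatrix} \alpha^{k-1} \\ \vdots \\ \alpha \\ 1 \end{pmatrix}, \tag{9}$$

where $A$ is the $k \times k$ symmetric, positive definite autocovariance matrix of the channel fading
coefficients $\{R_s\}_{s \in S}$, and $D$ is the $k \times k$ "input" matrix:

$$A = \begin{pmatrix} 1 & \alpha & \cdots & \alpha^{k-1} \\ \alpha & 1 & \cdots & \alpha^{k-2} \\ \vdots & & \ddots & \vdots \\ \alpha^{k-1} & \cdots & \alpha & 1 \end{pmatrix}, \quad D = \begin{pmatrix} \sqrt{P_0} & 0 & \cdots & 0 \\ 0 & \ddots & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & \sqrt{P_{k-1}} \end{pmatrix}.$$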

A power allocation that minimizes the error variances of the estimates for all $j$, subject
to the power constraint (8), is naturally an optimal one. Examining (9), we note that a power
allocation that minimizes $V_{j_0}$ for some $j_0$ will also minimize $V_j$ for all $j$, as it will be one
that maximizes

$$\begin{pmatrix} \alpha^{k-1} & \cdots & \alpha & 1 \end{pmatrix} D^T \left[D A D^T + \sigma_N^2 I\right]^{-1} D \begin{pmatrix} \alpha^{k-1} \\ \vdots \\ \alpha \\ 1 \end{pmatrix}. \tag{10}$$

The power allocation that maximizes (10) is the subject of the following theorem, proven in
the Appendix.


Theorem 1. The expression (10) is maximized when all the available power is allocated to the
last pilot, i.e., $P_j = 0$ for all $0 \le j \le (k - 2)$ and $P_{k-1} = k P_{tr}$.

We note that Theorem 1 holds whenever one allows the power allocation during the training
period to vary across the pilots. We also note that the result holds irrespective of how the power
is allocated for the data.
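The statement of the theorem can be checked numerically on small examples. The sketch below evaluates expression (10) for a given pilot power allocation and is only a sanity check under the first-order Gauss-Markov model with $\sigma_R = 1$, not a substitute for the proof; the function names are assumptions of this example.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def training_gain(powers, alpha, noise_var=1.0):
    """Expression (10) for pilot powers P_0, ..., P_{k-1}."""
    k = len(powers)
    d = [p ** 0.5 for p in powers]
    # D applied to the column vector (alpha^{k-1}, ..., alpha, 1).
    dv = [d[s] * alpha ** (k - 1 - s) for s in range(k)]
    # D A D^T + noise_var * I, with A[s][t] = alpha^{|s - t|}.
    m = [[d[s] * d[t] * alpha ** abs(s - t) + (noise_var if s == t else 0.0)
          for t in range(k)] for s in range(k)]
    return sum(x * y for x, y in zip(dv, solve(m, dv)))
```

With all the power on the last pilot the expression reduces to $k P_{tr} / (k P_{tr} + \sigma_N^2)$. For $k = 3$, $P_{tr} = 1$, $\sigma_N^2 = 1$ and $\alpha = 0.95$ this gives 0.75, whereas the flat allocation `training_gain([1.0, 1.0, 1.0], 0.95)` evaluates to roughly 0.706.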
Implications on the training duration:

• When considering policy IV, the powers of the individual training symbols are allowed
to vary and the theorem states that all the power should be allocated to the last training
symbol. Factoring in the loss of achievable rates due to training, it becomes clear that the
optimal duration is that of one pilot transmission.

• Since the achievable rates using policy III are less than or equal to those of policy IV, and since
the optimal solution for policy IV is a "flat" power allocation over the duration of
the training (which is one pilot), the solution is also optimal for policy III.

• Finally, since the statement of the theorem is valid irrespective of how the power is allocated
during data transmission and specifically even when a flat power allocation is used, the result
implies that for policy II, using a training duration of one pilot is optimal as well.

Naturally, these statements are true if the power level during training is optimized. In Section V
we validate these results numerically.

V. Numerical Results 

For a given channel model, a given SNR (power constraints), and estimation technique (causal 
or non-causal), we numerically determine the optimal training strategy consisting of: 

1. The duration of training or the number of pilots k. 

2. The inter-pilot spacing T. 

3. The power allocation for the pilots and data symbols in a transmitted frame, according to the
transmission policy used.

In our work, the quality measure is the achievable rate, which we compute for pilot
clustering/training periods of up to six pilots in each frame. We study the low received SNR regime
(SNR values of -3 dB, 0 dB, 3 dB, and 6 dB) for a first-order Gauss-Markov fading process with
values of α = 0.9, 0.95, 0.97, and 0.99. On the receiver side, causal and non-causal estimation
are investigated. We present hereafter graphs for some chosen test cases and compare the rates
achieved using 1, 2, 3, 4, 5, and 6-pilot clustering strategies for different scenarios of SNR and
fading correlation levels.
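The structure of such a search can be sketched for policy I with causal estimation. The code below is only a skeleton: `surrogate_rate` is a hypothetical stand-in chosen for this example and is NOT the binary-input quantity optimized in Section III-A, so the numbers it produces are not the rates reported in this section. All function names are assumptions.

```python
import math


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def causal_variances(k, T, P, alpha, noise_var=1.0):
    """Error variances V_k, ..., V_{T-1} for k flat pilots of power P."""
    m = [[P * alpha ** abs(s - t) + (noise_var if s == t else 0.0)
          for t in range(k)] for s in range(k)]
    out = []
    for j in range(k, T):
        c = [P ** 0.5 * alpha ** (j - s) for s in range(k)]
        out.append(1.0 - sum(x * y for x, y in zip(c, solve(m, c))))
    return out


def surrogate_rate(P, v, noise_var=1.0):
    # HYPOTHETICAL stand-in for the elementary mutual information: it treats
    # the estimation error as extra noise. It only illustrates the search.
    return math.log2(1.0 + P * (1.0 - v) / (P * v + noise_var))


def best_training(P, alpha, max_k=6, max_T=40, rate=surrogate_rate):
    """Exhaustive search over (k, T); returns (rate per symbol, k, T)."""
    best = None
    for k in range(1, max_k + 1):
        for T in range(k + 1, max_T + 1):
            variances = causal_variances(k, T, P, alpha)
            r = sum(rate(P, v) for v in variances) / T
            if best is None or r > best[0]:
                best = (r, k, T)
    return best
```

Replacing `surrogate_rate` with an interpolated table of the elementary mutual information would turn the skeleton into the search described in the list above.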

We note first that the numerical results confirm the observation previously made that the
achievable rate in equation (3) depends on the choice of $\{m_1, m_2, p\}_i$, i.e., the input distribution of the
i-th symbol. As the symbol gets further away from the training pilots, the channel estimation
quality (measured through the estimation error variance) is degraded and hence the amount of
information sent over the channel decreases. This translates into a shift of $\{m_1, m_2, p\}_i$ from
the antipodal distribution (optimal for a perfectly known channel) with $p_1 \approx p_2$ (high entropy)
toward the other extreme of on-off keying (optimal for the IID Rayleigh fading case) with
$p_1 \gg p_2$ (low entropy).

We also note that in the case of causal estimation, our numerical results are consistent with
the results in Section IV.

A. Results for Transmission Policy I 

For transmission policy I, pilot clustering achieves higher rates than the 1-pilot scheme under
certain conditions. In Figure 4, for an SNR = 0 dB, α = 0.99 and causal
estimation, training with 4 pilots and an inter-pilot spacing of T = 29 symbols is optimal. An
increase of 8.2% in information rate is achieved relative to the best rate achievable with a 1-pilot
scheme. The results for other test cases are shown in Table I.


Fig. 4. Achievable Rates for policy I, for SNR = 0 dB and α = 0.99 with Causal Estimation.


However, there are some scenarios where pilot clustering is not useful. For the case when SNR
= 6 dB, α = 0.97 and causal estimation, the 1-pilot scheme achieves the optimal rates.
TABLE I

Achievable Rates for Different Transmission Policies

Test Case | Policy I | Policy II | Policy III | Policy IV
α = 0.99, SNR = 0 dB, Causal Estimation | nP=4, T=29, Rate≈0.2247, 8.2%* | nP=1, T=22, Rate≈0.2418, 7.6%† | nP=1, T=23, Rate≈0.2422 | nP=1, T=22, Rate≈0.2422
α = 0.97, SNR = 6 dB, Causal Estimation | nP=1, T=15, Rate≈0.3782 | nP=1, T=15, Rate≈0.3829, 1.2%† | nP=1, T=15, Rate≈0.3836 | nP=1, T=15, Rate≈0.3836
α = 0.97, SNR = -3 dB, Non-Causal Estimation | nP=3, T=19, Rate≈0.1374, 4.3%* | nP=1, T=18, Rate≈0.1470, 6.9%† | nP=1, T=18, Rate≈0.1472 | nP=1, T=18, Rate≈0.1472
Jakes' model (fd = 100 Hz, fs = 10 kHz), SNR = 3 dB, Causal Estimation | nP=2, T=14, Rate≈0.3224, 4%* | nP=1, T=13, Rate≈0.3508, 8.8%† | nP=1, T=13, Rate≈0.3510 | nP=1, T=13, Rate≈0.3510
Jakes' model (fd = 100 Hz, fs = 10 kHz), SNR = 0 dB, Non-Causal Estimation | nP=4, T=29, Rate≈0.2843, 16%* | nP=1, T=30, Rate≈0.3064 | nP=1, T=30, Rate≈0.3064 | nP=1, T=30, Rate≈0.3064

* relative to the rate achieved by the 1-pilot scheme (Policy I).
† relative to the achievable rate under Policy I.
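The relative gains of policy II over policy I can be recomputed from the rates listed in the table. The short sketch below does this arithmetic; the helper name `percent_gain` is an assumption of this example, and since the listed rates are rounded, the recomputed percentages agree with the reported ones only to within about a tenth of a percent.

```python
def percent_gain(rate, baseline):
    """Relative increase of rate over baseline, in percent."""
    return 100.0 * (rate / baseline - 1.0)


# (Policy I rate, Policy II rate) pairs as listed in Table I.
TABLE_ROWS = [
    (0.2247, 0.2418),  # Gauss-Markov, SNR = 0 dB, causal
    (0.3782, 0.3829),  # Gauss-Markov, SNR = 6 dB, causal
    (0.1374, 0.1470),  # Gauss-Markov, SNR = -3 dB, non-causal
    (0.3224, 0.3508),  # Jakes' model, SNR = 3 dB, causal
]

GAINS = [percent_gain(p2, p1) for p1, p2 in TABLE_ROWS]
```

For the first row, for instance, the ratio 0.2418 / 0.2247 corresponds to a gain of about 7.6%.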


Moreover, in Figure 5, at an SNR = 0 dB and α = 0.9 with causal estimation, training is not
beneficial in the first place because the information rate is less than that achieved over an IID
Rayleigh fading channel.

In conclusion, we can distinguish three cases. The first is when training is not useful.
The second is when the 1-pilot scheme gives the highest rates. The third is when pilot
clustering is beneficial. From our numerical results, we note that as SNR increases and coherence
time decreases, clustering becomes useless and the whole scheme is pushed toward the 1-pilot
training strategy and even to the extreme case of no training at all. This is directly related to
the fact that training is inefficient (less CSI) when fading decorrelates quickly or when SNR is
high.


Fig. 5. Achievable Rates for policy I, for SNR = 0 dB and α = 0.9 with Causal Estimation.

B. Results for Transmission Policy II 

In this policy, the pilots are sent with fixed power $P_{tr}$ (per pilot) and so are the data symbols,
which are transmitted with power $P_d$ (Section III-B2), such that $\frac{1}{T} \sum_{j=0}^{T-1} P_j \le P$. Therefore, the
optimal training strategy includes determining the optimal power allocation ($P_{tr}$ and $P_d$) for the
transmitted frame. Here the notion of SNR is naturally associated with the average power $P$.

Figure 6 shows the achievable rates for SNR = 0 dB and α = 0.99 with causal estimation. Unlike
the results for policy I (Figure 4), training with 4 pilots is no longer optimal. The 1-pilot
scheme (with T = 22) now offers a 7.6% increase in the achievable rate compared to the 4-pilot
scheme for policy I. The corresponding optimal power allocation across the transmission frame
is shown in Figure 7.

The rest of the results are presented in Table I and they all confirm that, as expected, pilot
clustering is not optimal for policy II, nor for any transmission strategy where the pilots' power
is subject to optimization. In this case, the transmitter decreases the estimation
error variance (higher throughput) by boosting the power of the single pilot instead of increasing
the number of pilots $k$ and getting penalized by the normalizing term $\frac{1}{T}$ in equation (7).


Fig. 6. Achievable Rates for policy II, for SNR = 0 dB and α = 0.99 with Causal Estimation.


Fig. 7. Optimal Symbol Power Allocation (one frame) for policy II, for SNR = 0 dB and α = 0.99 with Causal
Estimation.

If a peak power constraint is imposed on the power of the pilots, the optimal training duration
will not necessarily be one pilot. This can be seen from Figure 8, which shows the optimal
power allocation across the transmission frame for SNR = 0 dB and α = 0.99 with causal estimation
whenever a peak constraint $P_{tr} \le 3P$ is imposed. This constraint effectively imposes a
maximum Peak-to-Average Power Ratio (PAPR) value of 3.


[Figure: power per symbol versus position in the transmission frame; legend: Pilot Symbol, Data Symbol.]

Fig. 8. Optimal Symbol Power Allocation (one frame) for policy II, for SNR = 0 dB and α = 0.99 with Causal
Estimation and peak constraint $P_{tr} \le 3P$.


As mentioned earlier, there are some scenarios where training is not useful and the rate is
always less than that achieved over an IID Rayleigh fading channel. This is observed with causal
estimation for an SNR = 0 dB and α = 0.9, for example. In that case all the power is allocated to
the data symbols, indicating that training is not beneficial.

C. Results for Transmission Policy III 

For policy III, we send the data symbols with varying power while keeping a flat power
allocation for the pilots. As already shown in Section IV, clustering is not useful in this
case either. The transmitter boosts the power of the single pilot used in training to decrease
the error variance and increase the achievable rate.
The numerical results are in accordance with those of Section IV and they show how the power
of the symbols is adapted to the estimation error variance. In Figure 9, the power allocated to
each symbol and the variation of the error variance are presented for an SNR = 0 dB and α = 0.99
with non-causal estimation. This shows that symbols with lower variance are sent with higher
power and vice versa. However, we should note that the power variations among the data symbols
are not pronounced.


[Figure: power per symbol versus position in the transmission frame (positions 0 to 35).]

Fig. 9. Optimal Symbol Power Allocation (one frame) for policy III, for SNR = 0 dB and α = 0.99 with Non-Causal
Estimation.


The achievable rates for other cases are summarized in Table I. Adapting
the symbol power to the quality of estimation yields only a slight increase in achievable rates
compared to policy II. As a result, one can say that uniform power allocation for the data symbols
is sufficiently close to optimal and presents a more practical transmission strategy.

D. Results for Transmission Policy IV 

Here both the pilots and data symbols are sent with varying power (Section III-B4). However,
from the results for transmission policy III, we already know that sending the data symbols with
uniform power is very close to optimal.

Let us consider the case for an SNR = 0 dB and α = 0.99. We choose a 4-pilot training scheme.
For causal estimation, the power allocated to the pilots is shown in Figure 10. We notice that all
of the power was found numerically to be allocated to the pilot closest to the data symbols, leaving the
rest of the pilots that are further away with no power and therefore useless, which is consistent
with the results of Theorem 1 and Section IV. Combining this result with the penalty factor $\frac{1}{T}$
in equation (3), we reach the conclusion that the 1-pilot scheme is always optimal (Table I).


Fig. 10. Optimal Symbol Power Allocation (one frame) for policy IV, for SNR = 0 dB and α = 0.99 with Causal
Estimation.

A similar result is shown in Figure 11 for the non-causal estimation scenario. The powers of
the first pilot (playing a prominent role in the non-causal part) and last pilot (with a prominent
role in the causal part) are increased.

Whenever a peak power constraint is imposed on the power of the pilots, the optimal training
duration will potentially involve pilot clustering. The optimal duration and power allocation in
Figure 12 are for an SNR = 0 dB and α = 0.99 with causal estimation whenever a peak constraint
$P_{tr} \le 3P$ is imposed.


VI. Other Fading Process Models 

Whenever the fading process follows a different model, appropriate results may be readily
derived as the numerical optimization is only dependent on the autocovariance function of the
process as seen from equation (2). In what follows, we present sample results using Jakes' model.
Fig. 11. Optimal Symbol Power Allocation (one frame) for policy IV, for SNR = 0 dB and α = 0.99 with Non-Causal
Estimation.


Fig. 12. Optimal Symbol Power Allocation (one frame) for policy IV, for SNR = 0 dB and α = 0.99 with Causal
Estimation and peak constraint $P_{tr} \le 3P$.
Jakes' Model

In Jakes' model [23], the normalized (unit variance) continuous-time autocorrelation function
of the fading process is given by

$$\phi_{RR}(\tau) = J_0(2 \pi f_d \tau),$$

where $J_0(\cdot)$ is the zeroth-order Bessel function of the first kind and $f_d$ is the maximum Doppler
frequency. For the purposes of discrete-time simulation of this model [28], the autocorrelation
sequence becomes

$$\phi_{RR}[l] = J_0(2 \pi f_d T_s |l|),$$

where $1/T_s$ is the symbol rate.

In Table U we list a sample of the results obtained for a bandwidth $f_s = 10$ kHz and a Doppler
shift of $f_d = 100$ Hz. For example, optimal training consists of $k = 4$ and $T = 29$ when we
have an SNR = 0 dB and non-causal estimation. Throughput is improved by 16% in this case.
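
As a rough illustration of the discrete-time autocorrelation sequence above, the following Python sketch evaluates $J_0(2\pi f_d T_s |l|)$ for the quoted parameters. It assumes that the 10 kHz bandwidth equals the symbol rate $1/T_s$; the function names `bessel_j0` and `jakes_autocorr` are ours, and the Bessel function is approximated numerically from its integral representation rather than taken from a library.

```python
import math

def bessel_j0(x, n=2000):
    """Approximate J0(x) = (1/pi) * integral_0^pi cos(x sin t) dt (midpoint rule)."""
    h = math.pi / n
    return sum(math.cos(x * math.sin((i + 0.5) * h)) for i in range(n)) / n

def jakes_autocorr(lag, f_d=100.0, symbol_rate=10e3):
    """Normalized autocorrelation at an integer lag: J0(2*pi*f_d*T_s*|lag|)."""
    t_s = 1.0 / symbol_rate  # assumption: symbol rate equals the 10 kHz bandwidth
    return bessel_j0(2.0 * math.pi * f_d * t_s * abs(lag))

if __name__ == "__main__":
    # Print a few lags to see how slowly the correlation decays at f_d * T_s = 0.01.
    for lag in (0, 1, 10, 50):
        print(lag, round(jakes_autocorr(lag), 6))
```

With $f_d T_s = 0.01$ the sequence decays slowly over tens of symbols, which is the regime in which a frame of several tens of symbols can still share one training block.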

VII. Conclusion

We studied the performance of the non-feedback pilot-based adaptive modulation scheme [20],
[21], [29] over time-varying Rayleigh fading channels. We measured the performance in terms
of achievable rates using binary signaling and we investigated the benefits of pilot clustering as 
well as power allocation. 

We introduced a modular method to compute the rates in an efficient manner. Moreover, four 
types of transmission policies were analyzed. For each policy, we determined the optimal training 
strategy consisting of: 

1. The duration of training. 

2. The inter-pilot spacing. 

3. The power allocation for the pilots and data symbols in the frame. 

Pilot clustering proved to be useful in the low SNR-high coherence time range where training 
is efficient (Policy I). However, when the pilot power is subject to optimization (Policies II, 
III and IV), training for a smaller period but with boosted power becomes more beneficial than 
training with more pilots. We proved that the optimal training duration using causal estimation is 
indeed one whenever the power level during training is optimized and allowed to take arbitrary 
values. Numerical results suggest that this is also the case when using non-causal estimation at
the receiver. 

We also noted that the numerical computations indicate that a flat power allocation across the 
data slots in a frame is very close to optimal whenever the pilot power is subject to optimization. 

On the other hand, training is useless in the high SNR-small coherence time range and the 
rate is always less than that achieved over an IID Rayleigh fading channel. Several test cases 
are shown throughout this work to analyze how optimal training varies with channel conditions 
and from one transmission policy to another. 

Extensions to this work can include adaptive schemes that integrate temporal and spatial 
components like the Multiple-Input Multiple-Output (MIMO) scenario. 

Appendix A 

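In this appendix we provide a proof for Theorem 1. For notational convenience, define

$$\phi = V^T D^T \left[ D A D^T + \sigma_N^2 I \right]^{-1} D V,$$

where $V = (\alpha^{k-1}, \alpha^{k-2}, \cdots, \alpha, 1)^T$, $A$ is the $k \times k$ symmetric, positive definite autocovariance
matrix of the channel fading coefficients $\{R_s\}_{s \in S}$ and $D$ is the $k \times k$ “input” matrix:

$$A = \begin{pmatrix} 1 & \alpha & \cdots & \alpha^{k-1} \\ \alpha & 1 & \cdots & \alpha^{k-2} \\ \vdots & \vdots & \ddots & \vdots \\ \alpha^{k-1} & \alpha^{k-2} & \cdots & 1 \end{pmatrix}, \quad
D = \begin{pmatrix} \sqrt{P_0} & 0 & \cdots & 0 \\ 0 & \sqrt{P_1} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sqrt{P_{k-1}} \end{pmatrix}, \quad
V = \begin{pmatrix} \alpha^{k-1} \\ \alpha^{k-2} \\ \vdots \\ 1 \end{pmatrix}.$$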


Note that $\phi < 1$ for any $k \geq 1$ because $\phi = 1 - V_{k-1}$. We establish first the following lemma:

Lemma 1. Let $U$ be a $k \times k$ diagonal matrix with non-negative entries $\{x_i\}_{i=0}^{k-1}$ on the diagonal.
Among all the permutations of the $\{x_i\}$’s, the one that maximizes $V^T [A + U]^{-1} V$ is one where
the diagonal entries are in non-increasing order.

Proof: Assume that $\{x_i\}_{i=0}^{k-1}$ are in the following order: $0 \leq x_0 \leq x_1 \leq \cdots \leq x_{k-1}$.
We prove in what follows that $U = \mathrm{diag}(x_{k-1}, x_{k-2}, \cdots, x_0)$ maximizes $V^T [A + U]^{-1} V$ using
induction on $k$. To highlight the dependence on $k$ we denote $\phi_k = V_k^T [A_k + U_k]^{-1} V_k$, which is
a positive quantity due to the positive definiteness of $[A_k + U_k]^{-1}$.
a) Base Cases: For $k = 1$, $\phi_1 = \frac{1}{x_0 + 1}$ and the statement holds. Examine now the case
$k = 2$:

$$U = \mathrm{diag}(x_0, x_1): \quad \phi_2 = \frac{\alpha^2 x_1 + x_0 - \alpha^2 + 1}{(x_0 + 1)(x_1 + 1) - \alpha^2},$$

$$U = \mathrm{diag}(x_1, x_0): \quad \phi_2 = \frac{\alpha^2 x_0 + x_1 - \alpha^2 + 1}{(x_1 + 1)(x_0 + 1) - \alpha^2}.$$

Since $\alpha < 1$ and $x_1 \geq x_0$, the second value is larger.
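
To make the comparison in the case $k = 2$ explicit: the two values of $\phi_2$ share the same denominator $(x_0+1)(x_1+1) - \alpha^2$, which is positive because $\alpha < 1$ and the $x_i$ are non-negative, so it suffices to compare the numerators. A short worked step, written in the notation of the lemma:

```latex
\begin{align*}
\left(\alpha^2 x_0 + x_1 - \alpha^2 + 1\right) - \left(\alpha^2 x_1 + x_0 - \alpha^2 + 1\right)
  &= (1 - \alpha^2)(x_1 - x_0) \;\geq\; 0 .
\end{align*}
```

The difference vanishes only when $x_1 = x_0$, in which case the two orderings coincide.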

b) Induction Step: Suppose the property holds true up to $k - 1$ ($k - 1 \geq 2$) and we prove
in what follows that it holds true for $k$:

$$\phi_k = \begin{pmatrix} \alpha^{k-1} & \alpha^{k-2} & \cdots & 1 \end{pmatrix} [A_k + U_k]^{-1} \begin{pmatrix} \alpha^{k-1} \\ \alpha^{k-2} \\ \vdots \\ \alpha \\ 1 \end{pmatrix},$$

where $A_k$ and $U_k$ are square matrices of size $k$. We prove that $\phi_k$ is maximized when $\{x_i\}_{i=0}^{k-1}$
are placed in non-increasing order on the diagonal matrix $U_k$. The proof proceeds as follows:
We first “fix” $x_{k-1}$ on the last diagonal entry of $U_k$ and prove that $\{x_i\}_{i=0}^{k-2}$ should be in a
non-increasing order to maximize $\phi_k$. Next, we “fix” $\{x_i\}_{i=0}^{k-3}$ on the first $(k - 2)$ diagonal entries of
$U_k$ and we prove that, if $x_{k-2} \leq x_{k-1}$, having $U = \mathrm{diag}\{x_0, x_1, \cdots, x_{k-3}, x_{k-1}, x_{k-2}\}$ (versus
$U = \mathrm{diag}\{x_0, x_1, \cdots, x_{k-3}, x_{k-2}, x_{k-1}\}$) gives us a larger value of $\phi_k$, completing the proof.

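• Using a block form, we write $[A_k + U_k]$ as:

$$A_k + U_k = \begin{pmatrix} E & F \\ F^T & G \end{pmatrix},$$

where

$$E = \begin{pmatrix} x_0 + 1 & \alpha & \cdots & \alpha^{k-2} \\ \alpha & x_1 + 1 & \cdots & \alpha^{k-3} \\ \vdots & \vdots & \ddots & \vdots \\ \alpha^{k-2} & \alpha^{k-3} & \cdots & x_{k-2} + 1 \end{pmatrix}, \quad
F = \begin{pmatrix} \alpha^{k-1} \\ \alpha^{k-2} \\ \vdots \\ \alpha \end{pmatrix} = \alpha V_{k-1} \quad \& \quad G = [x_{k-1} + 1].$$

This allows us to express $[A_k + U_k]^{-1}$ as

$$[A_k + U_k]^{-1} = \begin{pmatrix} E^{-1} + E^{-1}F[G - F^T E^{-1} F]^{-1} F^T E^{-1} & -E^{-1}F[G - F^T E^{-1} F]^{-1} \\ -[G - F^T E^{-1} F]^{-1} F^T E^{-1} & [G - F^T E^{-1} F]^{-1} \end{pmatrix},$$

so that

$$\phi_k = F^T \left[ E^{-1} + E^{-1}F[G - F^T E^{-1} F]^{-1} F^T E^{-1} \right] F - [G - F^T E^{-1} F]^{-1} F^T E^{-1} F - F^T E^{-1} F [G - F^T E^{-1} F]^{-1} + [G - F^T E^{-1} F]^{-1},$$

which reduces to:

$$\phi_k = F^T E^{-1} F + \frac{(1 - F^T E^{-1} F)^2}{x_{k-1} + 1 - F^T E^{-1} F}
= F^T E^{-1} F + \frac{(x_{k-1} + 1 - F^T E^{-1} F - x_{k-1})^2}{x_{k-1} + 1 - F^T E^{-1} F}
= (-x_{k-1} + 1) + \frac{x_{k-1}^2}{x_{k-1} + 1 - F^T E^{-1} F}.$$

The scalar $F^T E^{-1} F$ is equal to $\alpha^2 \phi_{k-1}$. Indeed, $F$ is of size $(k - 1) \times 1$ and equal to $\alpha V_{k-1}$,
and $E$ is a $(k - 1) \times (k - 1)$ sub-matrix of the form $[A_{k-1} + U_{k-1}]$. Since $\alpha < 1$, the scalar
$F^T E^{-1} F$ is less than one and the denominator is a positive quantity. Therefore, with $x_{k-1}$ fixed,
$\phi_k$ is maximized when $F^T E^{-1} F$ is maximized. By the induction hypothesis, with a fixed $x_{k-1}$ the
remaining $x_i$’s should be “placed” in non-increasing order on the diagonal of $E$ (and $U$) to maximize $\phi_k$.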
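
The reduction above can be checked numerically. The sketch below compares a direct evaluation of $V_k^T [A_k + U_k]^{-1} V_k$ with the recursion $\phi_k = 1 - x_{k-1} + x_{k-1}^2 / (x_{k-1} + 1 - \alpha^2 \phi_{k-1})$. It assumes that $A_k$ has entries $\alpha^{|i-j|}$ and uses the convention $\phi_0 = 0$ to start the recursion; the helper names (`solve`, `phi_direct`, `phi_recursive`) and the numerical values are ours and purely illustrative.

```python
def solve(matrix, rhs):
    """Solve matrix * y = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [u - factor * v for u, v in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def phi_direct(alpha, x):
    """V^T [A + U]^{-1} V with A[i][j] = alpha^|i-j|, U = diag(x), V = (alpha^(k-1), ..., 1)."""
    k = len(x)
    m = [[alpha ** abs(i - j) + (x[i] if i == j else 0.0) for j in range(k)]
         for i in range(k)]
    v = [alpha ** (k - 1 - i) for i in range(k)]
    return sum(a * b for a, b in zip(v, solve(m, v)))

def phi_recursive(alpha, x):
    """Same quantity via phi_k = 1 - x + x^2 / (x + 1 - alpha^2 * phi_{k-1}), phi_0 = 0."""
    phi = 0.0
    for xi in x:
        phi = 1.0 - xi + xi * xi / (xi + 1.0 - alpha ** 2 * phi)
    return phi

if __name__ == "__main__":
    x = [0.3, 1.2, 0.7, 2.0]
    print(phi_direct(0.9, x), phi_recursive(0.9, x))
```

The recursion needs only $O(k)$ operations, whereas the direct form requires solving a $k \times k$ linear system.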


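• Now fix $\{x_i\}_{i=0}^{k-3}$. We prove that with $x_{k-2} \leq x_{k-1}$, $U = \mathrm{diag}\{x_0, x_1, \cdots, x_{k-3}, x_{k-1}, x_{k-2}\}$
gives us a larger value for $\phi_k$. To do this, we consider a different decomposition of the matrix
$[A_k + U_k]$,

$$A_k + U_k = \begin{pmatrix} E & F \\ F^T & G \end{pmatrix},$$

where now

$$E = \begin{pmatrix} x_0 + 1 & \alpha & \cdots & \alpha^{k-3} \\ \alpha & x_1 + 1 & \cdots & \alpha^{k-4} \\ \vdots & \vdots & \ddots & \vdots \\ \alpha^{k-3} & \alpha^{k-4} & \cdots & x_{k-3} + 1 \end{pmatrix}, \quad
F = \begin{pmatrix} \alpha^{k-2} & \alpha^{k-1} \\ \vdots & \vdots \\ \alpha & \alpha^2 \end{pmatrix} \quad \& \quad
G = \begin{pmatrix} x_{k-2} + 1 & \alpha \\ \alpha & x_{k-1} + 1 \end{pmatrix}.$$

Since $F = \alpha V_{k-2} \begin{pmatrix} 1 & \alpha \end{pmatrix}$,

$$\phi_k = \alpha^4 V_{k-2}^T \left[ E^{-1} + E^{-1} F \left[ G - F^T E^{-1} F \right]^{-1} F^T E^{-1} \right] V_{k-2}
- 2\alpha^2 \begin{pmatrix} \alpha & 1 \end{pmatrix} \left[ G - F^T E^{-1} F \right]^{-1} F^T E^{-1} V_{k-2}
+ \begin{pmatrix} \alpha & 1 \end{pmatrix} \left[ G - F^T E^{-1} F \right]^{-1} \begin{pmatrix} \alpha \\ 1 \end{pmatrix}.$$

Noting that $\phi_{k-2} = V_{k-2}^T E^{-1} V_{k-2}$,

$$\begin{aligned}
\phi_k &= \alpha^4 \phi_{k-2} + \alpha^6 \phi_{k-2}^2 \begin{pmatrix} 1 & \alpha \end{pmatrix} \left[ G - F^T E^{-1} F \right]^{-1} \begin{pmatrix} 1 \\ \alpha \end{pmatrix}
- 2\alpha^3 \phi_{k-2} \begin{pmatrix} \alpha & 1 \end{pmatrix} \left[ G - F^T E^{-1} F \right]^{-1} \begin{pmatrix} 1 \\ \alpha \end{pmatrix}
+ \begin{pmatrix} \alpha & 1 \end{pmatrix} \left[ G - F^T E^{-1} F \right]^{-1} \begin{pmatrix} \alpha \\ 1 \end{pmatrix} \\
&= \alpha^4 \phi_{k-2} + \underbrace{\begin{pmatrix} \alpha(1 - \alpha^2 \phi_{k-2}) & 1 - \alpha^4 \phi_{k-2} \end{pmatrix} \left[ G - F^T E^{-1} F \right]^{-1} \begin{pmatrix} \alpha(1 - \alpha^2 \phi_{k-2}) \\ 1 - \alpha^4 \phi_{k-2} \end{pmatrix}}_{\triangleq\, \xi(x_{k-2},\, x_{k-1})}.
\end{aligned}$$

Examining

$$\begin{aligned}
\xi(x_{k-2}, x_{k-1}) &= \begin{pmatrix} \alpha - \alpha^3 \phi_{k-2} & 1 - \alpha^4 \phi_{k-2} \end{pmatrix}
\begin{pmatrix} x_{k-2} + 1 - \alpha^2 \phi_{k-2} & \alpha - \alpha^3 \phi_{k-2} \\ \alpha - \alpha^3 \phi_{k-2} & x_{k-1} + 1 - \alpha^4 \phi_{k-2} \end{pmatrix}^{-1}
\begin{pmatrix} \alpha - \alpha^3 \phi_{k-2} \\ 1 - \alpha^4 \phi_{k-2} \end{pmatrix} \\
&= \frac{(\alpha - \alpha^3 \phi_{k-2})^2 x_{k-1} + (1 - \alpha^4 \phi_{k-2})^2 x_{k-2} + (1 - \alpha^2 \phi_{k-2})(1 - \alpha^4 \phi_{k-2})(1 - \alpha^2)}{x_{k-2} x_{k-1} + (1 - \alpha^4 \phi_{k-2}) x_{k-2} + (1 - \alpha^2 \phi_{k-2}) x_{k-1} + (1 - \alpha^2 \phi_{k-2})(1 - \alpha^2)}
\end{aligned} \tag{11}$$

and checking the two possibilities, $\xi(x_{k-2}, x_{k-1}) - \xi(x_{k-1}, x_{k-2})$ has the same sign as

$$\begin{aligned}
&\left[ (\alpha - \alpha^3 \phi_{k-2})^2 x_{k-1} + (1 - \alpha^4 \phi_{k-2})^2 x_{k-2} + (1 - \alpha^2 \phi_{k-2})(1 - \alpha^4 \phi_{k-2})(1 - \alpha^2) \right] \\
&\quad \times \left[ x_{k-2} x_{k-1} + (1 - \alpha^4 \phi_{k-2}) x_{k-1} + (1 - \alpha^2 \phi_{k-2}) x_{k-2} + (1 - \alpha^2 \phi_{k-2})(1 - \alpha^2) \right] \\
&- \left[ (\alpha - \alpha^3 \phi_{k-2})^2 x_{k-2} + (1 - \alpha^4 \phi_{k-2})^2 x_{k-1} + (1 - \alpha^2 \phi_{k-2})(1 - \alpha^4 \phi_{k-2})(1 - \alpha^2) \right] \\
&\quad \times \left[ x_{k-2} x_{k-1} + (1 - \alpha^4 \phi_{k-2}) x_{k-2} + (1 - \alpha^2 \phi_{k-2}) x_{k-1} + (1 - \alpha^2 \phi_{k-2})(1 - \alpha^2) \right],
\end{aligned}$$

which is zero if $\alpha = 1$ or $x_{k-2} = x_{k-1}$. Assuming $x_{k-2} < x_{k-1}$, it is of the same sign as

$$-(1 - \alpha^6 \phi_{k-2}^2)\, x_{k-2} x_{k-1} - (1 - \alpha^2 \phi_{k-2})(1 - \alpha^4 \phi_{k-2})(x_{k-1} + x_{k-2}) - (1 - \alpha^2 \phi_{k-2})^2 (1 - \alpha^2),$$

which is negative and hence $\xi(x_{k-2}, x_{k-1}) < \xi(x_{k-1}, x_{k-2})$. We conclude that, when fixing
$\{x_i\}_{i=0}^{k-3}$, $\phi_k$ is maximized when the last two diagonal elements $x_{k-2}$ and $x_{k-1}$ are placed in
non-increasing order.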

The final step in the proof is to note that if the diagonal entries are not in non-increasing
order, then either the first $(k - 1)$ entries are not or the last two entries are not. This contradicts
the previous two properties. ■
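
Lemma 1 can be spot-checked by brute force for small $k$. The following sketch is only an illustration and not part of the proof: it evaluates $\phi_k$ through the recursion $\phi_k = 1 - x_{k-1} + x_{k-1}^2/(x_{k-1} + 1 - \alpha^2 \phi_{k-1})$ obtained in the first bullet of the induction step (started from the convention $\phi_0 = 0$, which is ours) for every ordering of an arbitrary set of diagonal entries, and returns the ordering with the largest value. The function names and numerical values are hypothetical.

```python
from itertools import permutations

def phi(alpha, x):
    """phi_k for diagonal entries x = (x_0, ..., x_{k-1}), via the recursion in the text."""
    phi_k = 0.0
    for xi in x:
        phi_k = 1.0 - xi + xi * xi / (xi + 1.0 - alpha ** 2 * phi_k)
    return phi_k

def best_ordering(alpha, entries):
    """Exhaustively search all permutations for the one maximizing phi_k."""
    return max(permutations(entries), key=lambda order: phi(alpha, order))

alpha = 0.9
entries = (0.2, 0.7, 1.5, 3.0)
best = best_ordering(alpha, entries)
print(best)  # the lemma predicts the non-increasing ordering
```

The exhaustive search costs $k!$ evaluations, so it is only practical as a sanity check for a handful of pilots.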
Before we state and prove the theorem, a couple of quantities that will come in handy hereafter
are the partial derivatives of $\xi(x_{k-2}, x_{k-1})$ defined in (11):

$$\frac{\partial \xi}{\partial x_{k-2}} \propto -(\alpha - \alpha^3 \phi_{k-2})^2\, x_{k-1}^2 \tag{12}$$

$$\frac{\partial \xi}{\partial x_{k-1}} \propto -\left[ (1 - \alpha^2 \phi_{k-2})(1 - \alpha^2) + (1 - \alpha^4 \phi_{k-2})\, x_{k-2} \right]^2 \tag{13}$$

where the expressions above are those of the respective numerators. We note that both quantities
are non-positive and everything else being constant, the value of $\xi$ decreases as $x_{k-2}$ or $x_{k-1}$
increases.


Theorem. When maximizing the scalar $\phi$ over all the choices $\{P_j\}_{j=0}^{k-1}$ such that $\sum_{j=0}^{k-1} P_j = k P_{tr}$,
the maximum is achieved when all the available power is allocated to the last pilot, i.e., $P_j = 0$,
for all $0 \leq j \leq (k - 2)$ and $P_{k-1} = k P_{tr}$.


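Proof: We start by imposing a lower bound on the powers $\{P_j\}$’s. More precisely, for some
small enough $\epsilon > 0$, we assume that $P_j = \epsilon + P'_j$ and

$$D = \begin{pmatrix} \sqrt{\epsilon + P'_0} & 0 & \cdots & 0 \\ 0 & \sqrt{\epsilon + P'_1} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sqrt{\epsilon + P'_{k-1}} \end{pmatrix},$$

and we optimize over the $\{P'_j\}$’s subject to the constraint

$$\sum_{j=0}^{k-1} P'_j \leq k P_{tr} - k\epsilon. \tag{14}$$

The diagonal matrix $D$ is non-singular, allowing us to express the objective function $\phi$ as:

$$\phi = V^T \left[ A + \sigma_N^2 D^{-1} D^{-1} \right]^{-1} V.$$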
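
The rewriting of the objective function follows from a one-line manipulation, spelled out here for convenience. It uses only that $D$ is diagonal (so that $D^T = D$) and invertible, and it writes $\sigma_N^2$ for the noise variance, which is our reading of the notation:

```latex
\begin{align*}
D \left[ D A D + \sigma_N^2 I \right]^{-1} D
  &= \left[ D^{-1} \left( D A D + \sigma_N^2 I \right) D^{-1} \right]^{-1}
   = \left[ A + \sigma_N^2 D^{-1} D^{-1} \right]^{-1} .
\end{align*}
```

Multiplying on the left by $V^T$ and on the right by $V$ gives the stated form of the objective function.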

Applying the result of Lemma 1 with $U = \sigma_N^2 D^{-1} D^{-1}$ and diagonal entries $x_j = \frac{\sigma_N^2}{\epsilon + P'_j}$,
yields that the optimal $\{P'_j\}$’s have to be non-decreasing. Additionally, the derivative (13)
indicates that the upper bound (14) will be tight. Indeed, fixing $\{P'_0, \cdots, P'_{k-2}\}$ (or equivalently
$\{x_0, \cdots, x_{k-2}\}$) and increasing $P'_{k-1}$ (or equivalently decreasing $x_{k-1}$) will increase $\phi\ (= \phi_k)$.
This asserts that the power on the last pilot should be as large as possible so that the upper
bound (14) is met with equality.












The derivatives (12) and (13) allow us to make an even stronger statement: If $\{P'_0, \cdots, P'_{k-3}\}$
are fixed, among the choices of $P'_{k-2}$ and $P'_{k-1}$ such that

$$P'_{k-2} + P'_{k-1} \leq k P_{tr} - k\epsilon - \sum_{j=0}^{k-3} P'_j = M,$$

the one that maximizes $\phi_k$ is $P'_{k-2} = 0$ and $P'_{k-1} = M$.

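Indeed, since the bound will be met with equality and $P'_{k-2}$ is less or equal to $P'_{k-1}$ (by the
result of Lemma 1), we let $P'_{k-2} = p$ and $P'_{k-1} = M - p$ and optimize over $p \in [0, M/2]$.
Equivalently, $x_{k-2} = \frac{\sigma_N^2}{\epsilon + p}$, $x_{k-1} = \frac{\sigma_N^2}{\epsilon + M - p}$, and the derivative of $\phi_k$
with respect to $p$ is

$$\begin{aligned}
\frac{d\phi_k}{dp} &= \frac{\partial \xi}{\partial x_{k-2}} \frac{dx_{k-2}}{dp} + \frac{\partial \xi}{\partial x_{k-1}} \frac{dx_{k-1}}{dp} \\
&\propto \frac{\alpha^2 (1 - \alpha^2 \phi_{k-2})^2}{(\epsilon + p)^2 (\epsilon + M - p)^2} - \frac{\left[ (1 - \alpha^2 \phi_{k-2})(1 - \alpha^2)(\epsilon + p)/\sigma_N^2 + (1 - \alpha^4 \phi_{k-2}) \right]^2}{(\epsilon + p)^2 (\epsilon + M - p)^2},
\end{aligned}$$

which is of the same sign as

$$\begin{aligned}
&\alpha (1 - \alpha^2 \phi_{k-2}) - \left[ (1 - \alpha^2 \phi_{k-2})(1 - \alpha^2)(\epsilon + p)/\sigma_N^2 + (1 - \alpha^4 \phi_{k-2}) \right] \\
&\quad < \alpha (1 - \alpha^2 \phi_{k-2}) - (1 - \alpha^4 \phi_{k-2}) = (-1 - \alpha^3 \phi_{k-2})(1 - \alpha) < 0,
\end{aligned}$$

for any $p$ and therefore the maximum is attained when $p = 0$. Said differently, $\phi_k$ is maximum
when $P'_{k-2} = 0$ and $P'_{k-1} = M$.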

By Lemma 1, the optimal values of $P'_l$ for all $l \in \{0, 1, \cdots, k - 3\}$ are less or equal to $P'_{k-2}$.
Since for an optimal power allocation $P'_{k-2}$ is zero, $P'_l = 0$ for all $l \in \{0, 1, \cdots, k - 2\}$ and
$P'_{k-1} = k P_{tr} - k\epsilon$.

Finally, the same previous observations show that the smaller the $\epsilon$ the larger $\phi$ is.
Consequently, taking the limit as $\epsilon$ goes to zero yields the optimal solution and the proof of the theorem
is complete. ■
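
The statement of the theorem can be illustrated with a small grid search. The sketch below is not the authors' numerical procedure; it is a hypothetical check that assumes unit noise variance, $k = 3$ pilots, $P_{tr} = 1$ and $\alpha = 0.99$. It evaluates $\phi$ with the recursion from the proof of Lemma 1 after substituting $x_j = \sigma_N^2 / P_j$ and clearing the fraction, an algebraically equivalent form (our rearrangement) that remains well defined when $P_j = 0$.

```python
from itertools import product

def phi_from_powers(alpha, powers, noise_var=1.0):
    """phi for pilot powers (P_0, ..., P_{k-1}); zero powers are allowed."""
    phi = 0.0
    for p in powers:
        a = alpha ** 2 * phi
        # Same as 1 - x + x^2 / (x + 1 - a) with x = noise_var / p,
        # multiplied through by p so that p = 0 does not divide by zero.
        phi = (noise_var * a + (1.0 - a) * p) / (noise_var + (1.0 - a) * p)
    return phi

def best_allocation(alpha, k, p_tr, steps=12):
    """Grid search over allocations with sum(P_j) = k * p_tr."""
    unit = k * p_tr / steps
    best, best_phi = None, -1.0
    for head in product(range(steps + 1), repeat=k - 1):
        last = steps - sum(head)
        if last < 0:
            continue  # allocation would exceed the power budget
        powers = tuple(n * unit for n in head + (last,))
        value = phi_from_powers(alpha, powers)
        if value > best_phi:
            best, best_phi = powers, value
    return best, best_phi

best, best_phi = best_allocation(alpha=0.99, k=3, p_tr=1.0)
print(best, best_phi)
```

On this grid the search places the entire budget $k P_{tr}$ on the last pilot, in agreement with the theorem; a finer grid or a different $\alpha < 1$ can be tried by changing the arguments.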


References 

[1] A. Lapidoth and S. Shamai, “Fading channels: how perfect need perfect side information be?” IEEE Trans. on Inf. Theory,
vol. 48, no. 5, pp. 1118-1134, May 2002.

[2] A. Vakili, M. Sharif, and B. Hassibi, “The effect of channel estimation error on the throughput of broadcast channels,” in
IEEE Intl. Conf. Acoustics, Speech & Sig. Processing, ICASSP, vol. 4, May 2006, pp. IV-IV.

[3] J. Wang, M. Li, Y. Zhang, and Q. Zhou, “Effect of channel estimation error on the mutual information of MIMO fading
channels,” in Intl. Conf. Wireless Comm., Networking & Mobile Comp., WiCOM, Oct 2008, pp. 1-4.











[4] A. Lozano, R. Heath, and J. Andrews, “Fundamental limits of cooperation,” IEEE Trans. on Inf. Theory, vol. 59, no. 9,
pp. 5213-5226, Sept 2013.

[5] J. K. Cavers, “An Analysis of Pilot Symbol Assisted Modulation for Rayleigh Fading Channels,” IEEE Trans. on Veh.
Technol., vol. 40, no. 11, pp. 686-693, Nov. 1991.

[6] A. J. Goldsmith and S. G. Chua, “Variable-rate variable-power MQAM for fading channels,” IEEE Trans. on Comm.,
vol. 45, no. 10, pp. 1218-1230, Oct. 1997.

[7] X. Cai and G. B. Giannakis, “Adaptive PSAM Accounting for Channel Estimation and Prediction Errors,” IEEE Trans.
on Wireless Comm., vol. 4, no. 1, pp. 246-256, Jan. 2005.

[8] T. A. Lamahewa, P. Sadeghi, R. Kennedy, and P. Rapajic, “Model-based pilot and data power adaptation in PSAM with
periodic delayed feedback,” IEEE Trans. on Wireless Comm., vol. 8, no. 5, pp. 2247-2252, May 2009.

[9] M. Agarwal, M. Honig, and B. Ata, “Adaptive training for correlated fading channels with feedback,” IEEE Trans. on Inf.
Theory, vol. 58, no. 8, pp. 5398-5417, Aug 2012.

[10] X. Cai and G. Giannakis, “Adaptive PSAM accounting for channel estimation and prediction errors,” IEEE Trans. on Wireless
Comm., vol. 4, no. 1, pp. 246-256, Jan 2005.

[11] A. Ekpenyong and Y.-F. Huang, “Feedback constraints for adaptive transmission,” IEEE Sig. Processing Mag., vol. 24,
no. 3, pp. 69-78, May 2007.

[12] S. Misra, A. Swami, and L. Tong, “Cutoff rate optimal binary inputs with imperfect CSI,” IEEE Trans. on Wireless Comm.,
vol. 5, no. 10, pp. 2903-2913, Oct 2006.

[13] S. Akin and M. Gursoy, “Achievable rates and training optimization for fading relay channels with memory,” in Conf. Inf.
Sc. & Sys., CISS, March 2008, pp. 185-190.

[14] K. Almustafa, S. Primak, T. Willink, and K. Baddour, “On achievable data rates and optimal power allocation in fading
channels with imperfect CSI,” in Intl. Symp. Wireless Comm. Sys., ISWCS, Oct 2007, pp. 282-286.

[15] S. Akin and M. Gursoy, “Training optimization for Gauss-Markov Rayleigh fading channels,” in IEEE Intl. Conf. on Comm.,
ICC, June 2007, pp. 5999-6004.

[16] S. Savazzi and U. Spagnolini, “Optimizing training lengths and training intervals in time-varying fading channels,” IEEE
Trans. on Sig. Processing, vol. 57, no. 3, pp. 1098-1112, March 2009.

[17] H. Zhang, S. Wei, G. Ananthaswamy, and D. Goeckel, “Adaptive signaling based on statistical characterizations of outdated
feedback in wireless communications,” Proc. of the IEEE, vol. 95, no. 12, pp. 2337-2353, Dec 2007.

[18] D. Duong, B. Holter, and G. Oien, “Optimal pilot spacing and power in rate-adaptive MIMO diversity systems with imperfect
transmitter CSI,” in IEEE Workshop on Sig. Processing Adv. in Wireless Comm., June 2005, pp. 47-51.

[19] A. Maaref and S. Aissa, “Optimized rate-adaptive PSAM for MIMO MRC systems with transmit and receive CSI imperfections,”
IEEE Trans. on Comm., vol. 57, no. 3, pp. 821-830, March 2009.

[20] I. Abou-Faycal, M. Medard, and U. Madhow, “Binary Adaptive Coded Pilot Symbol Assisted Modulation over Rayleigh
Fading Channels without Feedback,” IEEE Trans. on Comm., vol. 53, no. 6, pp. 1036-1046, June 2005.

[21] K. Zeineddine and I. Abou-Faycal, “How Much Training is Optimal in Adaptive PSAM over Markov Rayleigh Fading
Channels,” IEEE Intl. Symp. Sig. Processing & Inf. Technology, ISSPIT, pp. 366-371, Dec. 2009.

[22] M. Medard, “The Effect upon Channel Capacity in Wireless Communications of Perfect and Imperfect Knowledge of the
Channel,” IEEE Trans. on Inf. Theory, vol. 46, no. 3, pp. 935-946, March 2000.

[23] W. C. Jakes, Microwave Mobile Communications. New York: Wiley, 1974.





[24] I. Abou-Faycal, M. D. Trott, and S. Shamai, “The Capacity of Discrete-Time Memoryless Rayleigh-Fading Channels,”
IEEE Trans. on Inf. Theory, vol. 47, no. 4, pp. 1290-1301, May 2001.

[25] R. Gallager, “Power Limited Channels: Coding, Multiaccess, and Spread Spectrum,” MIT LIDS, Nov. 1987.

[26] S. Verdu, “On Channel Capacity per unit cost,” IEEE Trans. on Inf. Theory, vol. 36, no. 5, pp. 1019-1030, Sep. 1990.

[27] C. Luo, “Communication for Wideband Fading Channels: on Theory and Practice,” Ph.D. dissertation, Massachusetts
Institute of Technology, Feb. 2006.

[28] K. E. Baddour and N. C. Beaulieu, “Autoregressive Modeling for Fading Channel Simulation,” IEEE Trans. on Wireless
Comm., vol. 4, no. 4, pp. 1650-1662, July 2005.

[29] A. Bdeir, I. Abou-Faycal, and M. Medard, “Power Allocation Schemes for Pilot Symbol Assisted Modulation over Rayleigh
Fading Channels with no Feedback,” IEEE Int. Conf. Comm., ICC, vol. 2, pp. 737-741, Jun. 2004.

[30] K. B. Petersen and M. S. Pedersen, “The matrix cookbook,” Nov 2012, version 20121115.



